Abstract. A finite set $A$ of integers is square-sum-free if no subset of $A$ sums to a square. In 1986, Erdős posed the problem of determining the largest cardinality of a square-sum-free subset of $\{1, \dots, n\}$. Answering this question, we show that this maximum cardinality is of order $n^{1/3+o(1)}$.



1. Introduction 



Let $A$ be a set of numbers. We denote by $S_A$ the collection of finite partial sums of $A$,

$$S_A := \Big\{ \sum_{x \in B} x \;\Big|\; B \subset A,\ |B| < \infty \Big\}.$$



For a positive integer $l \le |A|$ we denote by $l^*A$ the collection of partial sums of $l$ elements of $A$,

$$l^*A := \Big\{ \sum_{x \in B} x \;\Big|\; B \subset A,\ |B| = l \Big\}.$$
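As a concrete illustration of these two definitions, the following Python sketch computes $S_A$ and $l^*A$ for a small finite set by brute force. The function names `subset_sums` and `l_fold_sums` are ours and do not come from the paper, and we assume that only non-empty subsets are summed, so that 0 is not counted as a partial sum.

```python
from itertools import combinations


def subset_sums(A):
    """Return S_A, the set of sums of all non-empty subsets of A."""
    sums = set()
    for a in set(A):
        # Every sum found so far can be extended by a; a alone is also a sum.
        sums |= {s + a for s in sums} | {a}
    return sums


def l_fold_sums(A, l):
    """Return l*A, the set of sums of exactly l distinct elements of A."""
    return {sum(c) for c in combinations(set(A), l)}
```

For instance, `subset_sums({1, 2, 4})` contains every integer from 1 to 7, while `l_fold_sums({1, 2, 4}, 2)` consists of the three pairwise sums. The running time is exponential in $|A|$, so this is only meant for experiments with very small sets.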



Let $[x]$ denote the set of positive integers at most $x$. In 1986, Erdős [4] raised the following question:

Question 1.1. What is the maximal cardinality of a subset $A$ of $[n]$ such that $S_A$ contains no square?

We denote by $SF(n)$ the maximal cardinality in question. Erdős observed that

$$SF(n) = \Omega(n^{1/3}). \qquad (1)$$



To see this, consider the following example.

Example 1.2. Let $p$ be a prime and $k$ be the largest integer such that $kp \le n$. We choose $p$ of order $n^{2/3}$ such that $k = \Theta(n^{1/3})$ and $1 + \dots + k < p$. Then the set $A := \{p, 2p, \dots, kp\}$ is square-sum-free: every element of $S_A$ has the form $pm$ with $1 \le m \le 1 + \dots + k < p$, so it is divisible by $p$ but not by $p^2$.
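The construction can be checked by brute force for small parameters. The sketch below is only an illustration: the helper names and the concrete values ($n = 50$, $p = 11$, $k = 4$) are our own choices, not taken from the paper.

```python
from math import isqrt


def is_square(m):
    """True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m


def is_square_sum_free(A):
    """Brute-force test: no non-empty subset of A sums to a square."""
    sums = set()
    for a in set(A):
        sums |= {s + a for s in sums} | {a}
    return not any(is_square(s) for s in sums)


def multiples_example(p, k):
    """The set {p, 2p, ..., kp} used in the construction."""
    return [i * p for i in range(1, k + 1)]


if __name__ == "__main__":
    # n = 50, p = 11, k = 4: kp = 44 <= n and 1 + 2 + 3 + 4 = 10 < p.
    print(is_square_sum_free(multiples_example(11, 4)))
    # When 1 + ... + k >= p the argument can fail: {3, 6, 9} contains 9.
    print(is_square_sum_free(multiples_example(3, 3)))
```

The second call shows why the condition $1 + \dots + k < p$ matters: without it a partial sum may be divisible by $p^2$.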

Remark 1.3. The fact that $p$ is a prime is not essential. The construction still works if we choose $p$ to be a square-free number, namely, a number of the form $p = p_1 \dots p_l$ where the $p_i$ are different primes.

Erdős [4] conjectured that $SF(n)$ is close to the lower bound in (1). Shortly after Erdős' paper, Alon [1] proved the first non-trivial upper bound

$$SF(n) = O(n/\log n). \qquad (2)$$

Next, Lipkin [9] improved this to

$$SF(n) = O(n^{3/4+o(1)}). \qquad (3)$$

In [2], Alon and Freiman improved the bound further to

$$SF(n) = O(n^{2/3+o(1)}). \qquad (4)$$

The latest development was due to Sárközy [11], who showed

$$SF(n) = O(\sqrt{n \log n}). \qquad (5)$$

In this paper, we obtain the asymptotically tight bound

$$SF(n) = O(n^{1/3+o(1)}). \qquad (6)$$

Theorem 1.4. There is a constant $C$ such that for all $n \ge 2$

$$SF(n) \le n^{1/3}(\log n)^C. \qquad (7)$$

In fact, we are going to prove the following (seemingly) more general theorem.

Theorem 1.5. There is a constant $C$ such that the following holds for all sufficiently large $n$. Let $p$ be a positive integer less than $n^{2/3}(\log n)^{-C}$ and $A$ be a subset of $[n/p]$ of cardinality $n^{1/3}(\log n)^C$. Then there exists an integer $z$ such that $pz^2 \in S_A$.



SQUARES IN SUMSETS 






Theorem 1.4 is the special case when $p = 1$. Furthermore, Theorem 1.4 implies many special cases of Theorem 1.5. To see this, choose $A$ to have the form $A := \{pb \mid b \in B\}$ where $B$ is a subset of $[n/p]$ and $p$ is a square-free number. Then finding a square in $S_A$ is the same as finding a number of the form $pz^2$ in $S_B$.
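The equivalence used here rests on an elementary fact: for square-free $p$, the product $ps$ is a perfect square exactly when $s = pz^2$ for some integer $z$. The following sketch tests this numerically over a range of $s$; the function names and the search limit are illustrative assumptions of ours.

```python
from math import isqrt


def is_square(m):
    """True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m


def is_p_times_square(s, p):
    """True if s = p * z^2 for some integer z >= 0."""
    return s % p == 0 and is_square(s // p)


def equivalence_holds(p, limit):
    """Check, for 1 <= s < limit, that p*s is a square iff s = p*z^2."""
    return all(is_square(p * s) == is_p_times_square(s, p)
               for s in range(1, limit))
```

With a square-free value such as `p = 6` the check succeeds, whereas `p = 4` already fails at `s = 1`, which indicates why square-freeness of $p$ is assumed.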

If one replaces squares by higher powers, then the problem becomes easier and 
asymptotic bounds have been obtained earlier (see next section). 

Notations. We use Landau asymptotic notation such as $O, \Omega, \Theta, o$ throughout the paper, under the assumption that $n \to \infty$. Notation such as $\Theta_c(\cdot)$ means that the hidden constant in $\Theta$ depends on a (previously defined) quantity $c$. We will also omit all unnecessary floors and ceilings. All logarithms have natural base. As usual, $e(x)$ means $\exp(2\pi i x) = \cos 2\pi x + i \sin 2\pi x$.



2. The main ideas 

The general strategy for attacking Question 1.1 is as follows. One first tries to show that if $|A|$ is sufficiently large, then $S_A$ should contain a large additive structure. Next, one would argue that a large additive structure should contain a square.

In previous works [1, 2, 9, 11], the additive structure was a (homogeneous) arithmetic progression. (An arithmetic progression is homogeneous if it is of the form $\{ld, (l+1)d, \dots, (l+k)d\}$.) It is easy to show that if $P$ is a homogeneous AP of length $C_0 m^{2/3}$ in $[m]$, for some large constant $C_0$, then $P$ contains a square. Notice that the set $S_A$ is a subset of $[m]$ where $m := |A|n$. Thus, if one can show that $S_A$ contains a homogeneous AP of length $C_0 m^{2/3}$, then we are done. Sárközy proved that this is indeed the case, given $|A| \ge C_1\sqrt{n \log n}$ for a properly chosen constant $C_1$. This also solves (asymptotically) the problem when squares are replaced by higher powers, since in these cases, the lower bound (which can be obtained by modifying Example 1.2) is $\Omega(\sqrt{n})$.

Unfortunately, $\sqrt{n}$ is the limit of this argument, since there are examples of a subset $A$ of $[n]$ of size $\Omega(\sqrt{n})$ where the longest AP in $S_A$ is much shorter than $(|A|n)^{2/3}$. In order to present such an example, we will need the following definition (which will play a crucial role in the rest of the paper).

Definition 2.1 (Generalized arithmetic progression, GAP). A generalized arithmetic progression of rank $r$ is a set of the form

$$Q = \{a_0 + x_1a_1 + \dots + x_ra_r \mid 0 \le x_i \le L_i\}.$$

If all the sums $x_1a_1 + \dots + x_ra_r$ are distinct, we say that $Q$ is proper. If $a_0 = 0$, we say that $Q$ is homogeneous. (Homogeneous arithmetic progression thus corresponds to the case $r = 1$.) We call $L_1, \dots, L_r$ the sizes of $Q$ and $a_1, \dots, a_r$ its steps.
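To make the definition concrete, here is a small Python sketch that enumerates a GAP from its data $(a_0; a_1, \dots, a_r; L_1, \dots, L_r)$ and tests properness by brute force. The representation of a GAP as a triple `(a0, steps, sizes)` and the function names are our own conventions.

```python
from itertools import product


def gap_elements(a0, steps, sizes):
    """List a0 + x1*a1 + ... + xr*ar over all 0 <= xi <= Li (with repeats)."""
    ranges = [range(L + 1) for L in sizes]
    return [a0 + sum(x * a for x, a in zip(xs, steps))
            for xs in product(*ranges)]


def is_proper(a0, steps, sizes):
    """A GAP is proper when all (L1 + 1)...(Lr + 1) sums are distinct."""
    elements = gap_elements(a0, steps, sizes)
    return len(set(elements)) == len(elements)


def is_homogeneous(a0):
    """A GAP is homogeneous when its base point a0 is zero."""
    return a0 == 0
```

For example, the rank 2 GAP with steps `[1, 10]` and sizes `[9, 9]` is proper (it is the set of integers from 0 to 99 written in base 10), while steps `[1, 2]` with sizes `[2, 2]` give repeated sums.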

Example 2.2. Consider

$$A := \{q_1x_1 + q_2x_2 \mid 1 \le x_i \le N\}$$

where $q_1 \approx q_2 \approx n^{3/4}$ are different primes and $N = \frac{1}{100}n^{1/4}$. It is easy to show that $A$ is a proper GAP of rank 2 and $S_A$ is contained in the proper GAP

$$\{q_1x_1 + q_2x_2 \mid 1 \le x_i \le 1 + \dots + N\}.$$

Thus, the longest AP in $S_A$ has length at most $1 + \dots + N = \Theta(n^{1/2})$; while $A$ has cardinality $\Theta(n^{1/2})$.

The key fact that enables us to go below $\sqrt{n}$ and reach the optimal bound $n^{1/3}$ is a recent theorem of Szemerédi and Vu [12] that showed that if $|A| \ge Cn^{1/3}$ for some sufficiently large constant $C$, then $S_A$ does contain a large proper GAP of rank at most 2.

Lemma 2.3. [12] There are positive constants $C$ and $c$ such that the following holds. If $A$ is a subset of $[n]$ of cardinality at least $Cn^{1/3}$, then $S_A$ contains either an AP $Q$ of length $c|A|^2$ or a proper GAP $Q$ of rank 2 and cardinality at least $c|A|^3$.

Ideally, the next step would be showing that a large proper GAP Q (which is a 
subset of [|A|n]) contains a square. Thanks to strong tools from number theory, 
this is not too hard (though not entirely trivial) if Q is homogeneous. However, we 
do not know how to force this assumption. 

The assumption of homogeneity is essential, as without this, one can easily run into local obstructions. For example, if $Q$ is a GAP of the form

$$\{a_0 + a_1x_1 + a_2x_2 \mid 0 \le x_i \le L\}$$

where both $a_1$ and $a_2$ are divisible by 6, but $a_0 \equiv 2 \pmod 6$, then clearly $Q$ cannot contain a square, as 2 is not a square modulo 6.
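The local obstruction is easy to observe numerically. The sketch below lists the quadratic residues modulo $q$ and searches a rank 2 GAP for perfect squares; the specific steps 6 and 12 and the size $L = 30$ used in the comments are illustrative choices of ours.

```python
from math import isqrt


def squares_mod(q):
    """Set of residues z^2 mod q."""
    return {(z * z) % q for z in range(q)}


def gap_contains_square(a0, a1, a2, L):
    """Brute-force search for a perfect square in {a0 + a1*x1 + a2*x2}."""
    for x1 in range(L + 1):
        for x2 in range(L + 1):
            m = a0 + a1 * x1 + a2 * x2
            if m >= 0 and isqrt(m) ** 2 == m:
                return True
    return False


if __name__ == "__main__":
    print(sorted(squares_mod(6)))              # 2 does not appear
    print(gap_contains_square(2, 6, 12, 30))   # base point 2 mod 6
    print(gap_contains_square(4, 6, 12, 30))   # base point 4 is itself a square
```

Since every element of the first GAP is congruent to 2 modulo 6, no choice of $L$ can produce a square, however large the GAP is.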

In order to overcome this obstacle, we need to add several twists to the plan. First, we are going to use only a small subset $A'$ of $A$ to create a large GAP $Q$. Assume that $Q$ has the form

$$\{a_0 + a_1x_1 + a_2x_2 \mid 0 \le x_i \le L\}.$$

($Q$ can also have rank one but that is the simpler case.) Let $q$ be the g.c.d. of $a_1$ and $a_2$. If $a_0$ is a square modulo $q$, then there is no local obstruction and in principle we can treat $Q$ as if it was homogeneous.

In the next move, we try to add the remaining elements of $A$ (from $A'' := A \setminus A'$) to $a_0$ to make it a square modulo $q$. This, however, faces another local obstruction. For instance, if in the above example, all elements of $A''$ are divisible by 6, then $a_0$ will always be $2 \pmod 6$ no matter how we add elements from $A''$ to it.

Now comes a key point. A careful analysis reveals that having all elements of $A''$ divisible by the same integer (larger than one, of course) is the only obstruction. Thus, we obtain a useful dichotomy: either $S_A$ contains a square or there is an integer $p > 1$ which divides all elements of a large subset $A''$ of $A$.

Now we keep working with $A''$. We can write this set as $\{pb \mid b \in B\}$ where $B$ is a subset of $[n/p]$. In order to show that $S_{A''}$ contains a square, it suffices to show that $S_B$ contains a number of the form $pz^2$. This explains the necessity of Theorem 1.5.

A nice feature of the above plan is that it also works for the more general problem considered in Theorem 1.5. We are going to iterate, setting new $A := A''$ of the previous step. Since the number of iterations (i.e., the number of $p$'s) is only $O(\log n)$, if we have $|A''| \ge (1 - \frac{1}{(\log n)^c})|A|$ in each step, for a sufficiently large constant $c$, then the set $A''$ will never be empty and this guarantees that the process should terminate at some point, yielding the desired result.

In the next lemma, which is the main lemma of the paper, we put these arguments 
into a quantitative form. 

Lemma 2.4. The following holds for any sufficiently large constant $C$. Let $p$ be a positive integer less than $n^{2/3}(\log n)^{-C}$ and $A$ be a subset of $[n/p]$ of cardinality $n^{1/3}(\log n)^C$. Then there exists $A' \subset A$ of cardinality $|A'| \le n^{1/3}(\log n)^{C/3}$ such that one of the following holds (with $A'' := A \setminus A'$):

• $S_{A'}$ contains a GAP

$$Q = \{r + qx \mid 0 \le x \le L\}$$

where $L \ge n^{2/3}(\log n)^{C/4}$ and $q \le \frac{n^{2/3}(\log n)^{C/12}}{p}$ and $r \equiv pz^2 \pmod q$ for some integer $z$.

• $S_{A'}$ contains a proper GAP

$$Q = \{r + q(q_1x_1 + q_2x_2) \mid 0 \le x_1 \le L_1,\ 0 \le x_2 \le L_2,\ (q_1, q_2) = 1\}$$

such that $\min(L_1, L_2) \ge n^{1/3}(\log n)^{C/4}$, $L_1L_2 \ge n(\log n)^{C/2}$, $q \le \frac{n^{1/3}}{(\log n)^{C/8}p}$ and $r \equiv pz^2 \pmod q$ for some integer $z$.

• There exists an integer $d > 1$ such that $d \mid a$ for all $a \in A''$.

Given this lemma, we can argue as before and show that after some iterations, one of the first two cases must occur. We show that in these cases the GAP $Q$ should contain a number of the form $pz^2$, using classical tools from number theory (see Section 9 and Section 10).

The proof of Lemma 2.4 is technical and requires a preparation involving tools from both combinatorics and number theory. These tools will be the focus of the next two sections.

3. Tools from additive combinatorics 

This section contains tools from additive combinatorics, which will be useful in the proof of Lemmas 3.6 and 2.4. Let $X, Y$ be two sets of numbers. We define

$$X + Y := \{x + y \mid x \in X, y \in Y\}; \qquad X - Y := \{x - y \mid x \in X, y \in Y\}.$$

A translate of a set $X$ is a set $X'$ of the form $X' := \{a + x \mid x \in X\}$. For instance, every GAP is a translate of a homogeneous GAP.

The first tool is the so-called Covering lemma, due to Ruzsa (see [10] or [13, Lemma 
2.14]). 

Lemma 3.1 (Covering Lemma). Assume that $X, Y$ are finite sets of integers. Then $X$ is covered by at most $|X + Y|/|Y|$ translates of $Y - Y$.
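The standard proof of the covering lemma is constructive: choose a maximal family of elements $x \in X$ whose translates $x + Y$ are pairwise disjoint. The sketch below implements this greedy selection, assuming this textbook argument rather than quoting the proof in [10]; the function names are ours.

```python
def ruzsa_centers(X, Y):
    """Greedily choose x in X so that the translates x + Y are pairwise disjoint."""
    Y = set(Y)
    centers, occupied = [], set()
    for x in sorted(set(X)):
        translate = {x + y for y in Y}
        if occupied.isdisjoint(translate):
            centers.append(x)
            occupied |= translate
    return centers


def covered_by_translates(X, Y, centers):
    """Check that every x in X lies in c + (Y - Y) for some chosen center c."""
    diff = {y1 - y2 for y1 in Y for y2 in Y}
    return all(any(x - c in diff for c in centers) for x in X)
```

The chosen translates are disjoint subsets of $X + Y$, each of size $|Y|$, so at most $|X + Y|/|Y|$ centers are selected. Any $x$ that was skipped has $x + Y$ meeting some $c + Y$, which places $x$ in $c + (Y - Y)$.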

The second tool is the powerful inverse theorem of Freiman [5], [13, Chapter 5].

Lemma 3.2 (Freiman's inverse theorem). Let $\gamma$ be a given positive number. Let $X$ be a set in $\mathbb{Z}$ such that $|X + X| \le \gamma|X|$. Then there exists a proper GAP $P$ of rank at most $d = d(\gamma)$ and cardinality $O_\gamma(|X|)$ that contains $X$.

Freiman's theorem has the following variant ([5, 12], [13, Chapter 5]), which has a weaker conclusion, but provides the optimal estimate for the rank $d$. This lemma played a key role in [12].

Lemma 3.3. Let $\gamma < 2^d$ be a given positive number. Let $X$ be a set in $\mathbb{Z}$ such that $|X + X| \le \gamma|X|$. Then there exists a proper GAP $P$ of rank at most $d$ and cardinality $O_\gamma(|X|)$ that contains $X$.

This lemma will not be sufficient for our purpose here. We are going to need 
the following refinement, which can be proved by combining Lemma 3.3 and the 
Covering lemma. 

Lemma 3.4. [7] [13, Chapter 5] Let $\gamma, \delta$ be positive constants. Let $X$ be a set in $\mathbb{Z}$ such that $|X + X| \le \gamma|X|$. Then there exists a proper GAP $P$ of rank at most $\lfloor \log_2 \gamma + \delta \rfloor$ and cardinality $O_{\gamma,\delta}(|X|)$ such that $X$ is covered by $O_{\gamma,\delta}(1)$ translates of $P$.

We say that a GAP $Q = \{a_0 + x_1a_1 + \dots + x_da_d \mid 0 \le x_i \le L_i\}$ is positive if its steps $a_i$ are positive. A useful observation is that if the elements of $Q$ are positive, then $Q$ itself can be brought into a positive form.

Lemma 3.5. A GAP with positive elements can be brought into a positive form.



Proof (of Lemma 3.5). Assume that

$$Q = \{a_0 + x_1a_1 + \dots + x_da_d \mid 0 \le x_i \le L_i\}.$$

By setting $x_i = 0$, we can conclude that $a_0 > 0$. Without loss of generality, assume that $a_1, \dots, a_j < 0$ and $a_{j+1}, \dots, a_d > 0$. By setting $x_i = 0$ for all $i > j$ and $x_i = L_i$, $i \le j$, we have

$$a_0' := a_0 + a_1L_1 + \dots + a_jL_j > 0.$$

Now we can rewrite $Q$ as

$$Q = \{a_0' + x_1(-a_1) + \dots + x_j(-a_j) + x_{j+1}a_{j+1} + \dots + x_da_d \mid 0 \le x_i \le L_i\},$$

completing the proof. ■
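The rewriting in this proof amounts to the substitution $x_i \mapsto L_i - x_i$ for every negative step. A minimal Python sketch, with a GAP stored as a triple `(a0, steps, sizes)` (a representation we assume for illustration), is:

```python
from itertools import product


def gap_set(a0, steps, sizes):
    """The set {a0 + x1*a1 + ... + xd*ad : 0 <= xi <= Li}."""
    ranges = [range(L + 1) for L in sizes]
    return {a0 + sum(x * a for x, a in zip(xs, steps))
            for xs in product(*ranges)}


def to_positive_form(a0, steps, sizes):
    """Flip every negative step a_i by substituting x_i -> L_i - x_i."""
    # The new base point absorbs a_i * L_i for each negative step a_i.
    new_a0 = a0 + sum(a * L for a, L in zip(steps, sizes) if a < 0)
    new_steps = [abs(a) for a in steps]
    return new_a0, new_steps, list(sizes)
```

The returned triple describes the same set of integers with all steps non-negative, and its base point is the smallest element of the GAP, hence positive whenever all elements are positive.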

Since we only deal with positive integers, this lemma allows us to assume that all 
GAPs arising in the proof are in positive form. 

Using the above tools and ideas from [12], we will prove Lemma 3.6 below, which asserts that if a set $A$ of $[n/p]$ is sufficiently dense, then there exists a small set $A' \subset A$ whose subset sums contain a large GAP $Q$ of small rank. Furthermore, the set $A'' := A \setminus A'$ is contained in only a few translates of $Q$. This lemma will serve as a base from which we will attack Lemma 2.4, using number theoretical tools discussed in the next section.

Lemma 3.6. The following holds for all sufficiently large constants $C$. Let $p$ be a positive integer less than $n^{2/3}(\log n)^{-C}$ and $A$ be a subset of $[n/p]$ of cardinality $n^{1/3}(\log n)^C$. Then there exists a subset $A'$ of $A$ of cardinality $|A'| \le n^{1/3}(\log n)^{C/3}$ such that one of the following holds (with $A'' := A \setminus A'$):

• $S_{A'}$ contains an AP

$$Q = \{r + qx \mid 0 \le x \le L\}$$

where $L \ge n^{2/3}(\log n)^{C/2}$ and there exist $m = O(1)$ different numbers $s_1, \dots, s_m$ such that $A'' \subset \{s_1, \dots, s_m\} + Q$.

• $S_{A'}$ contains a proper GAP

$$Q = \{r + a_1x_1 + a_2x_2 \mid 0 \le x_1 \le L_1,\ 0 \le x_2 \le L_2\}$$

such that $L_1L_2 \ge n(\log n)^{C/2}$ and there exist $m = O(1)$ numbers $s_1, \dots, s_m$ such that $A'' \subset \{s_1, \dots, s_m\} + Q$.

Remark. The proof actually gives a better lower bound for $L_1L_2$ in the second case ($2C/3$ instead of $C/2$), but this is not important in applications.



4. Tools from number theory

Fourier transform and Poisson summation. Let $f$ be a function with support on $\mathbb{Z}$. The Fourier transform $\hat f$ is defined as

$$\hat f(w) := \int_{\mathbb{R}} f(x) e(-wx)\, dx.$$

The classical Poisson summation formula asserts that

$$\sum_{m \in \mathbb{Z}} f(t + mT) = \frac{1}{T} \sum_{m \in \mathbb{Z}} \hat f\Big(\frac{m}{T}\Big) e\Big(\frac{mt}{T}\Big). \qquad (8)$$

For more details, we refer to [8, Section 4.3].

Smooth indicator functions. We will use the following well-known construction (see for instance [6, Theorem 18] for details).

Lemma 4.1. Let $\delta < 1/16$ be a positive constant and let $[M, M + N]$ be an interval. Then there exists a real function $f$ satisfying the following:

• $0 \le f(x) \le 1$ for any $x \in \mathbb{R}$.

• $f(x) = 0$ if $x \le M$ or $x \ge M + N$.

• $f(x) = 1$ if $M + \delta N \le x \le M + N(1 - \delta)$.

• $|\hat f(\lambda)| \le 16 \hat f(0) \exp(-\delta|\lambda N|^{1/2})$ for every $\lambda$.

A Weyl type estimate. Next, we need a Weyl type estimate for exponential sums.

Lemma 4.2. For any positive constant $\varepsilon$ there exist positive constants $\alpha = \alpha(\varepsilon)$ and $c(\varepsilon)$ such that the following holds. Let $a, q$ be co-prime integers, $\theta$ be a real number, and $I$ be an interval of length $N$. Let $M$ be a positive number such that $MN \ge q^{1+\varepsilon}$. Then,

$$\sum_{|m| \le M,\ m \ne 0} \Big| \sum_{z \in I} e\Big(\frac{amz^2}{q} + \theta mz\Big) \Big| \le c(\varepsilon) \Big( M\sqrt{N} + \frac{MN}{\sqrt{q}} \Big) (\log MN)^{\alpha}.$$

Quadratic residues. Finally, and most relevant to our problem, we need the following lemma, which shows the existence of integer solutions with given constraints for a quadratic equation.

Lemma 4.3. There is an absolute constant $D$ such that the following holds. Let $a_1, \dots, a_d, r, p, q$ be integers such that $p, q > 0$ and $(a_1, \dots, a_d, q) = 1$. Then the equation

$$a_1x_1 + \dots + a_dx_d + r \equiv pz^2 \pmod q \qquad (9)$$

has an integer solution $(z, x_1, \dots, x_d)$ satisfying $0 \le x_i \le (pq)^{1/2}(\log q)^D$.
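For small moduli one can simply search for solutions of (9) and see how small the $x_i$ can be taken. The brute-force sketch below is an experiment of ours, not the method of proof: it returns the smallest box size $B$ such that a solution with $0 \le x_i \le B$ exists. The constant $D$ in the lemma is not specified, so the code does not attempt to verify the bound $(pq)^{1/2}(\log q)^D$ itself.

```python
from functools import reduce
from itertools import product
from math import gcd


def small_solution(a, r, p, q):
    """Find (B, xs, z) with sum(a_i * x_i) + r = p * z^2 (mod q) and
    0 <= x_i <= B, for the smallest box size B that admits a solution."""
    if reduce(gcd, a, q) != 1:
        raise ValueError("need gcd(a_1, ..., a_d, q) = 1")
    # Map each attainable residue p*z^2 mod q to one witness z.
    witness = {}
    for z in range(q):
        witness.setdefault((p * z * z) % q, z)
    # With the gcd condition the linear form hits every residue mod q,
    # so a solution exists with all x_i in [0, q - 1].
    for B in range(q):
        for xs in product(range(B + 1), repeat=len(a)):
            residue = (sum(ai * xi for ai, xi in zip(a, xs)) + r) % q
            if residue in witness:
                return B, xs, witness[residue]
    raise AssertionError("unreachable when the gcd condition holds")
```

A call such as `small_solution([6, 10, 15], 7, 1, 30)` returns a box size, a tuple of $x_i$ and a value of $z$ that can be substituted back into (9) as a check.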

The rest of the paper is organized as follows. The proof of the combinatorial statement, Lemma 3.6, comes first in Section 5. We then start the number theoretical part by giving a proof of Lemma 4.2 in Section 6. The verification of Lemma 4.3 comes in Section 7. After all these preparations, we will be able to establish Lemma 2.4 in Section 8. The proof of the main result, Theorem 1.5, is presented in Sections 9 and 10.



5. Proof of Lemma 3.6 

We repeat some arguments from [12] with certain modifications. The extra information we want to get here (compared with what has already been done in [12]) is the fact that the set $A''$ is covered by only a few translates of $Q$.

5.1. An algorithm. Let $A'$ be a subset of cardinality $|A'| = n^{1/3}(\log n)^{C/3}$ and let $A'' := A \setminus A'$. By a simple combinatorial argument (see [12, Lemma 7.9]), we can find in $A'$ disjoint subsets $A_1', \dots, A_{m_1}'$ such that $|A_i'| \le 20\log_2|A'|$ and $|l_1^*A_i'| \ge |A'|/2$ where

$$l_1 \le 10\log_2|A'| \quad \text{and} \quad m_1 = |A'|/(40\log_2|A'|). \qquad (10)$$

(For the definition of $l^*A$ see the beginning of the introduction.)

Without loss of generality, we can assume that $m_1$ is a power of 4. Let $B_1, \dots, B_{m_1}$ be subsets of cardinality $b_1 = |A'|/2$ of the sets $l_1^*A_1', \dots, l_1^*A_{m_1}'$ respectively. Following [12, Lemma 7.6], we will run an algorithm with the $B_i$'s as input. The goal of this algorithm is to produce a GAP which has nice relations with $A''$ (while still not as good as the GAP we wanted in the lemma). In the next few paragraphs, we are going to describe this algorithm.

At the first step, set $B_1^1 := B_1, \dots, B_{m_1}^1 := B_{m_1}$ and let $\mathfrak{B}^1 := \{B_1^1, \dots, B_{m_1}^1\}$. Let $h$ be a large constant to be determined later.

At the $(t+1)$-th step, we choose indices $i, j$ and elements $a_1, \dots, a_h \in A''$ that maximize the cardinality of $\bigcup_{d=1}^h (B_i^t + B_j^t + a_d)$ (if there are many choices, choose one arbitrarily). Define ${B'}_1^{t+1}$ to be the union. Delete from $A''$ the used elements $a_1, \dots, a_h$, and remove from $\mathfrak{B}^t$ the used sets $B_i^t, B_j^t$. Find the next maximum union $\bigcup_{d=1}^h (B_i^t + B_j^t + a_d)$ with respect to the updated sets $\mathfrak{B}^t$ and $A''$.
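A single selection of this kind can be sketched by exhaustive search. The code below is a toy version under our own assumptions: sets are Python sets of integers, `pool` plays the role of $A''$, and all pairs and all $h$-element choices are tried, which is only feasible for very small inputs. The argument in the text needs only the existence of a maximizer, not an efficient way to find it.

```python
from itertools import combinations


def best_union(sets, pool, h):
    """Return (union, i, j, shifts) maximizing the size of the union of
    sets[i] + sets[j] + a over the h chosen shifts a, for i < j."""
    best = None
    for i, j in combinations(range(len(sets)), 2):
        pair_sum = {x + y for x in sets[i] for y in sets[j]}
        for shifts in combinations(pool, h):
            union = {s + a for s in pair_sum for a in shifts}
            # Keep the first maximizer found; ties are broken arbitrarily.
            if best is None or len(union) > len(best[0]):
                best = (union, i, j, shifts)
    return best
```

After one call, the step described above would delete the chosen shifts from `pool` and the two used sets from `sets`, then repeat on what remains.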

Assume that we have created $m_{t+1} := m_t/4$ sets ${B'}_1^{t+1}, \dots, {B'}_{m_{t+1}}^{t+1}$. By the algorithm, we have

$$|{B'}_1^{t+1}| \ge \dots \ge |{B'}_{m_{t+1}}^{t+1}| := b_{t+1}.$$

Now for each $1 \le i \le m_{t+1}$ we choose a subset $B_i^{t+1}$ of cardinality exactly $b_{t+1}$ in ${B'}_i^{t+1}$. These $m_{t+1}$ sets (of the same cardinality) form a collection $\mathfrak{B}^{t+1}$, which is the output of the $(t+1)$-th step.

Since $m_{t+1} = m_t/4$, there are still $m_t/2$ unused sets $B_i^t$ left in $\mathfrak{B}^t$. Without loss of generality, assume that those are $B_1^t, \dots, B_{m_t/2}^t$. With a slight abuse of notation, we use $A''$ at every step, although this set loses a few elements each time. (The number of deleted elements is very small compared to the size of $A''$.)

Let $l_{t+1} := 2l_t + 1$. Observe that

• $l_t \le 2^t l_1$ (by definition);

• $b_t \le l_t n/p$ (since $\bigcup_{d=1}^h (B_i^{t-1} + B_j^{t-1} + a_d) \subset [l_t n/p]$);

• $$\Big|\bigcup_{d=1}^h (B_i^t + B_j^t + a_d)\Big| \le b_{t+1} \qquad (11)$$

for all $1 \le i < j \le m_t/2$ and $a_1, \dots, a_h \in A''$ (by the algorithm, as it always chooses a union with maximum size).

Now let $c$ be a large constant and $k$ be the largest index such that $b_i > cb_{i-1}$ for all $i \le k$. Then we have

$$c^k b_1 \le b_k \le l_k n/p.$$

Since $b_1 = |A'|/2$ and $l_k \le 2^k l_1$, we deduce an upper bound for $k$,

$$k \le \log_{c/2} \frac{2 l_1 n}{|A'| p}.$$

Next, by the definition of $k$, we have $b_{k+1} \le cb_k$. By (11), the following holds for all unused sets $B_i^k, B_j^k$ (with $1 \le i < j \le m_k/2$) and for all $a_1, \dots, a_h \in A''$:

$$\Big|\bigcup_{d=1}^h (B_i^k + B_j^k + a_d)\Big| \le b_{k+1} \le cb_k = c|B_1^k|.$$

In particular

$$|B_1^k + B_i^k| \le c|B_1^k|$$

holds for all $2 \le i \le m_k/2$.

By the Plünnecke-Ruzsa estimate (see [13, Corollary 6.28]), we have

$$|B_1^k + B_1^k| \le c^2|B_1^k|.$$

It then follows from Freiman's theorem, Lemma 3.2, that there exists a proper GAP $R$ of rank $O_c(1)$, of size $O_c(1)|B_1^k|$ such that $R$ contains $B_1^k$. Furthermore, by Lemma 3.1, $B_i^k$ is contained in $c$ translates of $B_1^k - B_1^k$, thus $B_i^k$ is also contained in $O_c(1)$ translates of $R$.

Before continuing, we would like to point out that the parameter h has not yet 
played any role in the arguments. The freedom of choosing h will be important in 
what follows. We are going to obtain the desired GAP Q (claimed in the lemma) 
from R by a few additional operations. 

5.2. Creation of many similar GAPs. One problem with $R$ is that its cardinality can be significantly smaller than the bounds on $Q$ in Lemma 3.6. We want to obtain larger GAPs by adding many translates of $R$. While we cannot do exactly this, we can do nearly as well by the following argument, which creates many GAPs which are translates of each other and have cardinalities comparable to that of $R$.

By the pigeonhole principle, for $i \le m_k/2$, we can find a set $B_i' \subset B_i^k$ with cardinality $\Theta_c(1)b_k$ which is contained in one translate of $R$.

By [12, Lemma 5.5], there exists $g = O_c(1)$ such that $B_1' + \dots + B_g'$ contains a proper GAP $Q_1$ of cardinality $\Theta_c(1)|R|$. Create $Q_2$ by summing $B_{g+1}', \dots, B_{2g}'$, and so on. At the end we obtain $\frac{m_k}{2g} = \Theta_c(1)m_k$ such GAPs. Following [12, Lemma 5.5], we can require the $Q_i$'s to have the properties below

• $\mathrm{rank}(Q_i) = \mathrm{rank}(R) = O_c(1)$;

• $|Q_i| = \Theta_c(1)|R| = \Theta_c(1)b_k$;

• each $Q_i$ is a subset of a translate of $gR$. Thus by Lemma 3.1, $R$ is contained in $O_c(1)$ translates of $Q_i - Q_i$;

• the $j$-th size of $Q_i$ differs from the $j$-th size of $R$ by a (multiplicative) factor of order $\Theta_c(1)$, for all $j$;

• the $j$-th step of $Q_i$ is a multiple of the $j$-th step of $R$ for all $j$.

Thus, by the pigeonhole principle and truncation (if necessary) we can obtain $m' = \Theta_c(m_k)$ GAPs, say $Q_1, \dots, Q_{m'}$, which are translates of each other. An important remark here is that since the $Q_i$ are obtained from summing different $B$'s, the sum $Q_1 + \dots + Q_{m'}$ is a subset of $S_{A'}$. The desired GAP $Q$ will be a subset of this sum.

5.3. Embedding $A''$. In this step, we embed $A''$ in a union of few translates of a GAP $Q_1$ of constant rank.

We set the (so far untouched) parameter $h$ to be sufficiently large so that

$$\Theta_c(1) = h \ge c|B_1^k|/|B_1'|.$$

Let $d$ be the largest number such that there are $d$ elements $a_1, \dots, a_d$ of $A''$ for which the sets $B_1' + B_2' + a_i$ are disjoint. Assume for the moment that $d \ge h$, then we would have

$$\Big|\bigcup_{i=1}^h (B_1' + B_2' + a_i)\Big| = h|B_1' + B_2'| \ge h|B_1'| \ge c|B_1^k|.$$

However, this is impossible because $\bigcup_{i=1}^h (B_1' + B_2' + a_i) \subset \bigcup_{i=1}^h (B_1^k + B_2^k + a_i)$ and the latter has cardinality less than $c|B_1^k|$ by definition. Thus we have $d < h$. So $d = O_c(1)$.

Let us fix $d$ elements $a_1, \dots, a_d$ from $A''$ which attained the disjointness in the definition of $d$. By the maximality of $d$, for any $a \in A''$ there exists $a_i$ so that $(B_1' + B_2' + a) \cap (B_1' + B_2' + a_i) \ne \emptyset$. Hence

$$a - a_i \in B_1' + B_2' - (B_1' + B_2') = (B_1' - B_1') + (B_2' - B_2') \subset 2R - 2R.$$

Thus $A''$ is covered by at most $d = O_c(1)$ translates of $2R - 2R$. On the other hand, since $R$ is contained in $O_c(1)$ translates of $Q_1 - Q_1$, $2R - 2R$ is contained in $O_c(1)$ translates of $4Q_1 - 4Q_1$. It follows that $A''$ is covered by $O_c(1)$ translates of $Q_1$.

The remaining problem here is that $Q_1$ does not yet have the required rank and cardinality. We will obtain these by adding the $Q_i$ together (recall that these GAPs are translates of each other) and using a rank reduction argument, following [12] (see also [13, Chapter 12]).

5.4. Rank reduction. Let $P$ be the homogeneous translate of $Q_1$ (and also of $Q_2, \dots, Q_{m'}$). Recall that

$$|P| = |Q_1| = \Theta_c(b_k) = \Omega_c(c^k b_1),$$

and also

$$m' = \Theta_c(m_k) = \Theta_c\Big(\frac{m_1}{4^k}\Big), \quad \text{and} \quad l_{k+1} \le 2^{k+1} l_1.$$

Set $l := \min\{m', |A'|/2l_{k+1}\}$. Recall that $|A'| = n^{1/3}(\log n)^{C/3}$, $l_1 \le 10\log_2|A'|$ and $b_1 = |A'|/2$. By choosing $c$ and $C$ sufficiently large, we can guarantee that

$$l|P| \ge n^{2/3}(\log n)^{C/2}; \qquad l^2|P| \ge n(\log n)^{2C/3}, \qquad (12)$$

and also

$$l^3|P| > n^{4/3}(\log n)^C. \qquad (13)$$

Now we invoke Lemma 3.4 to find a large GAP in $lP$. Assume, without loss of generality, that $l = 2^s$ for some integer $s$. We start with $P_0 := P$ and $\ell_0 := l$. If $2^sP_0$ is proper, then we stop. If not, then there exists a smallest index $i_1$ such that $2^{i_1}P_0$ is proper but $2^{i_1+1}P_0$ is not.

By Lemma 3.4 (applying to $2^{i_1}P_0$; see also [12, Lemma 4.2]) we can find a GAP $S$ which contains a $\Theta_c(1)$ portion of $2^{i_1}P_0$ such that $\mathrm{rank}(S) < r := \mathrm{rank}(2^{i_1}P_0)$. We denote by $P'$ the intersection of $S$ and $2^{i_1}P_0$.

By [12, Lemma 5.5], there is a constant $g = \Theta_c(1)$ such that the set $2^gP'$ contains a proper GAP $P_1$ of rank equal to $\mathrm{rank}(S)$ and cardinality $\Theta_c(1)|2^{i_1}P_0|$. Set $\ell_1 := \ell_0/2^{i_1+g}$ if $\ell_0/2^{i_1+g} > 1$ and proceed with $P_1, \ell_1$ and so on. Otherwise we stop.

Observe that if $2^{i_j}P_j$ is proper, then $|2^{i_j}P_j| = (1 + o(1))2^{i_jr_j}|P_j|$, where $r_j$ is the rank of $P_j$.

As the rank of $P_0$ is $O_c(1)$, and $r_{j+1} \le r_j - 1$, we must stop after $O_c(1)$ steps. Let $Q'$ be the proper GAP obtained when we stop. It has rank $d'$, for some integer $d' \le r$, and cardinality at least $\Theta_c(1)\ell_0^{d'}|P_0| = \Theta_c(1)l^{d'}|P|$. On the other hand, since a translate of $lP$ is contained in $S_{A'}$, $|Q'| \le |A'|n/p \le |A'|n$, that is $\Theta_c(1)l^{d'}|P| \le |A'|n$. Because of (13), this holds only if $d' \le 2$.

5.5. Properties of $Q$. We showed that $A''$ is contained in $O_c(1)$ translates of $Q_1$, thus it is contained in $\Theta_c(1)$ translates of $2^{i_1}P_0$.

By Lemma 3.4, $2^{i_1}P_0$ is covered by $O_c(1)$ translates of $S$. It follows that $A''$ is contained in $\Theta_c(1)$ translates of $S$. On the other hand, by Lemma 3.1, $S$ is contained in $O_c(1)$ translates of $P' - P'$. Thus $A''$ is contained in $O_c(1)$ translates of $P' - P'$, and hence in $\Theta_c(1)$ translates of $(2^gP') - (2^gP')$ as well. Furthermore, by Lemma 3.1, $2^gP'$ is covered by $O_c(1)$ translates of $P_1$ (more precisely, $P_1 - P_1$), thus we conclude that $A''$ is covered by $O_c(1)$ translates of $P_1$. Because we stop after $\Theta_c(1)$ steps, a similar relation is also valid between $A''$ and any $P_j$. Thus, $A''$ is covered by $O_c(1)$ translates of $Q'$.

Furthermore, $Q'$ is a subset of $lP$. Thus a translate $Q$ of $Q'$ lies in $Q_1 + \dots + Q_{m'} \subset S_{A'}$. This $Q$ has rank $1 \le d' \le 2$ and cardinality $|Q| = |Q'| \ge \Theta(1)l^{d'}|P|$. (The right hand side satisfies the lower bounds claimed in Lemma 3.6, thanks to (12).) This is the GAP claimed in Lemma 3.6 and our proof is complete.

6. Proof of Lemma 4.2 

If $q$ is a prime, the lemma is a corollary of the well known Weyl's estimate (see [8]). We need to add a few arguments to handle the general case. The following lemma will be useful.

Lemma 6.1. Let $\tau(n)$ be the number of positive divisors of $n$. For any given $k \ge 3$ there exists a positive constant $\beta(k)$ such that the following holds for every $n$.

$$\tau(n) = O_k\Big( \sum_{d \mid n,\ d \le n^{1/k}} \tau(d)^{\beta(k)} \Big).$$

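Proof (of Lemma 6.1). We can set $\beta(k) = k\log(k+1)$. We factorize $n$ in the following specific way

$$n = \prod_{i=1}^u p_i^{a_i} \prod_{j=1}^v q_j^{b_j}$$

where $p_1 < \dots < p_u$, $q_1 < \dots < q_v$ are primes and $a_i \ge k > b_j \ge 1$. Set

$$d := \prod_{i=1}^u p_i^{\lfloor a_i/k \rfloor} \prod_{j=1}^{\lfloor v/k \rfloor} q_j.$$

Then $d \le n^{1/k}$ by definition and

$$(k+1)^k \tau(d)^{\beta(k)} = (k+1)^k\, 2^{\lfloor v/k \rfloor k\log(k+1)} \prod_{i=1}^u \big(\lfloor a_i/k \rfloor + 1\big)^{k\log(k+1)} \ge (k+1)^v \prod_{i=1}^u (1 + a_i) \ge \tau(n),$$

completing the proof. ■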

Now we start the proof of Lemma 4.2. Let $S := \sum_{|m| \le M,\ m \ne 0} \big| \sum_{z \in I} e\big(\frac{amz^2}{q} + \theta mz\big) \big|$.

Following Weyl's argument, we use Cauchy-Schwarz and the triangle inequality to obtain

$$S^2 \le 2M \sum_{|m| \le M,\ m \ne 0}\ \sum_{z_1, z_2 \in I} e\Big(\frac{am(z_1^2 - z_2^2)}{q} + \theta m(z_1 - z_2)\Big).$$

For convenience, we change the variables, setting $u := z_1 - z_2$, $v := z_2$, then

$$S^2 \le 2M \sum_{|m| \le M,\ m \ne 0}\ \sum_{|u| \le N} e\Big(\frac{amu^2}{q} + \theta mu\Big) \sum_{v \in I,\ v \in I - u} e\Big(\frac{2amuv}{q}\Big)$$

$$\le 2M \sum_{|m| \le M,\ m \ne 0}\ \sum_{|u| \le N} \Big| \sum_{v \in I,\ v \in I - u} e\Big(\frac{2amuv}{q}\Big) \Big|.$$



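Next, using the basic estimate (see [8, Section 8.2], for instance)

$$\Big| \sum_{v \in J} e(\omega v) \Big| \le \min\Big(|J|, \frac{1}{\|\omega\|}\Big),$$

valid for any interval $J$ and real number $\omega$, where $\|\omega\|$ denotes the distance from $\omega$ to the nearest integer, we obtain that

$$S^2 \le 2M \sum_{|m| \le M,\ m \ne 0}\ \sum_{|u| \le N} \min\Big(N, \frac{1}{\|2amu/q\|}\Big).$$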
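The bound on linear exponential sums used in this step can be checked numerically. The sketch below uses the classical form $\min(N, 1/(2\|\omega\|))$ of the estimate for a sum over $N$ consecutive integers; this particular normalization is our assumption and may differ by a constant from the statement in [8].

```python
import cmath
import math


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def dist_to_nearest_integer(w):
    """The quantity ||w||: distance from w to the nearest integer."""
    return abs(w - round(w))


def linear_exponential_sum(w, start, length):
    """Sum of e(w*v) over the integers v in [start, start + length)."""
    return sum(e(w * v) for v in range(start, start + length))


def bound(w, length):
    """min(length, 1/(2*||w||)), read as length when w is an integer."""
    d = dist_to_nearest_integer(w)
    return length if d == 0 else min(length, 1 / (2 * d))
```

For $\omega = r/q$ with $q \nmid r$, the sum is a geometric series of modulus $|\sin(\pi N\omega)/\sin(\pi\omega)|$, and the bound follows from $|\sin(\pi\omega)| \ge 2\|\omega\|$.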



To estimate the right hand side, let $N_r$ be the number of pairs $(m, u)$ such that $2amu \equiv r \pmod q$. (In what follows, it is useful to keep in mind that $a$ and $q$ are co-prime.) We have

$$S(M, N, q)^2 \le 2M \Big( N N_0 + \sum_{1 \le r \le q/2} (N_r + N_{q-r}) \frac{q}{r} \Big). \qquad (14)$$



To finish the proof, we are going to derive a (uniform) bound for the $N_r$'s. For $0 \le r \le q - 1$ let $0 \le r_a \le q - 1$ be the only number such that $ar_a \equiv r \pmod q$. Thus $2amu \equiv r \pmod q$ is equivalent to $2mu \equiv r_a \pmod q$.

First we consider the case $r \ne 0$, thus $r_a \ne 0$. Write $2mu = r_a + sq$. It is clear that $r_a + sq \ne 0$ for all $s$. Since $|2mu| \le 2MN$, we have $|s| \le 2MN/q$. For each given $s$ the number of such pairs $(m, u)$ is bounded by $\tau(r_a + sq)$.

Choose $k = \max(\frac{2}{\varepsilon} + 2, 3)$, then $MN/q \ge (MN)^{2/k}$ by the assumption $MN \ge q^{1+\varepsilon}$. It follows from Lemma 6.1 that, for $r \ne 0$,

$$N_r \le \sum_{|s| \le 2MN/q} \tau(r_a + sq) = O_\varepsilon\Big( \sum_{d \le (MN)^{1/k}} \tau(d)^{\beta(k)} \sum_{|s| \le 2MN/q,\ d \mid r_a + sq} 1 \Big)$$

$$= O_\varepsilon\Big( \sum_{d \le (MN)^{1/k}} \tau(d)^{\beta(k)} \Big( \frac{MN}{qd} + O(1) \Big) \Big)$$

$$= O_\varepsilon\Big( \frac{MN}{q} \sum_{d \le (MN)^{1/k}} \frac{\tau(d)^{\beta(k)}}{d} \Big).$$

Notice that $\sum_{d \le x} \tau(d)^{\beta(k)} \ll x\log^{\beta'(k)} x$ for some positive constant $\beta'(k)$ depending on $\beta(k)$ (see [8, Section 1.6], for instance). By summation by parts we deduce that

$$N_r = O_\varepsilon\Big( \frac{MN}{q} \log^{\beta''(k)}(MN) \Big)$$

for some positive constant $\beta''(k)$ depending on $\beta'(k)$.

Now we consider the case $r = 0$. The equation $2mu = sq$ has at most $\tau(sq)$ solution pairs $(m, u)$, except when $s = 0$, the case that has $2M$ solutions $\{(m, 0) : |m| \le M, m \ne 0\}$. Thus we have

$$N_0 \le 2M + \sum_{|s| \le 2MN/q,\ s \ne 0} \tau(sq),$$

and hence,

$$N_0 = O_\varepsilon\Big( 2M + \frac{MN}{q} \log^{\beta''(k)}(MN) \Big).$$



Combining these estimates with (14), we can conclude that

$$S(M, N, q) \ll_\varepsilon (M\sqrt{N} + MN/\sqrt{q}) \log^{\alpha}(MN)$$

for some sufficiently large constant $\alpha = \alpha(\varepsilon)$.

7. Proof of Lemma 4.3

We are going to need the following simple fact.

Fact 7.1. Let $a_1, \dots, a_m, q$ be integers such that $(a_1, \dots, a_m, q) = 1$. Then we can select a decomposition $q = q_1 \dots q_l$ of $q$ and $l$ different numbers $a_{i_1}, \dots, a_{i_l}$ of $\{a_1, \dots, a_m\}$ (for some $l \ge 1$) such that

$$(q_i, q_j) = 1 \text{ for every } i \ne j \quad \text{and} \quad (a_{i_j}, q_j) = 1 \text{ for every } j.$$

Proof (of Fact 7.1). Let $q = q_1' \dots q_k'$ be the decomposition of $q$ into prime powers. For each $q_i'$ we assign a number $a_i'$ from $\{a_1, \dots, a_m\}$ such that $(q_i', a_i') = 1$ (the same $a_i'$ may be assigned to many $q_i'$). Let the $a_{i_j}$'s be the collection of the $a_i'$'s without multiplicity. Set $q_j$ to be the product of all $q_i'$ assigned to $a_{i_j}$. ■
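The proof of Fact 7.1 translates directly into a short procedure. The sketch below follows it step by step; the function names, the use of trial division for factoring, and the choice of the first suitable $a_i$ for each prime power are illustrative assumptions of ours.

```python
from math import gcd, prod


def prime_power_factors(q):
    """Prime-power decomposition of q >= 1 as a list, e.g. 360 -> [8, 9, 5]."""
    factors, d = [], 2
    while d * d <= q:
        if q % d == 0:
            power = 1
            while q % d == 0:
                q //= d
                power *= d
            factors.append(power)
        d += 1
    if q > 1:
        factors.append(q)
    return factors


def coprime_decomposition(a, q):
    """Return {index i: q_i} with prod(q_i) = q, the q_i pairwise coprime,
    and gcd(a[i], q_i) = 1, assuming gcd(a_1, ..., a_m, q) = 1."""
    parts = {}
    for power in prime_power_factors(q):
        # Some a_i is coprime to this prime power, by the gcd assumption.
        i = next(i for i, ai in enumerate(a) if gcd(ai, power) == 1)
        parts[i] = parts.get(i, 1) * power
    return parts


def is_valid_decomposition(a, q, parts):
    """Verify the three properties claimed in Fact 7.1."""
    values = list(parts.values())
    pairwise = all(gcd(values[s], values[t]) == 1
                   for s in range(len(values)) for t in range(s + 1, len(values)))
    return (prod(values) == q and pairwise
            and all(gcd(a[i], qi) == 1 for i, qi in parts.items()))
```

For instance, with `a = [6, 10, 15]` and `q = 30` each prime factor of $q$ is assigned to the unique $a_i$ coprime to it, so three factors $q_j$ are produced.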

The core of the proof of Lemma 4.3 will be the following proposition, which is 
basically the case of one variable in a slightly more general setting. 

Proposition 7.2. There is a constant $D$ such that the following holds. For given integers $g, h, p, t, z_1$ with $g, h, p > 0$, there exist integers $x \in [0, (ph)^{1/2}(\log h)^D]$ and $z_2$ such that $gx + pz_1^2 + tk \equiv pz_2^2 \pmod h$, where $k = (g, h)$.

Lemma 4.3 follows from Fact 7.1 and Proposition 7.2 by an inductive argument. Indeed, by the above fact we may assume that $q = q_1 \dots q_l$ where $(a_i, q_i) = 1$, and so

$$(a_l, q) \mid q_1 \cdots q_{l-1}.$$

Now suppose Lemma 4.3 is true for $l - 1$ variables, i.e. there are appropriate $x_1, \dots, x_{l-1}$ such that $a_1x_1 + \dots + a_{l-1}x_{l-1} + r = pz_1^2 + tq_1 \dots q_{l-1}$. Then we apply Proposition 7.2 for $q = h$, $g = a_l$ to find $x_l$. It thus remains to justify Proposition 7.2.

Proof (of Proposition 7.2). Without loss of generality we assume that $h \ge 3$. As $k = (g, h)$, we can write $g = ka$, $h = kq$ where $(a, q) = 1$. We shall find a solution in the form $z_2 = z_1 + zk$. Plugging in $z_2$ in this form and simplifying by $k$, we end up with the equation

$$ax + t \equiv pkz^2 + 2pz_1z \pmod q,$$

or equivalently,

$$x \equiv \bar a pkz^2 + 2\bar a pz_1z - \bar a t \pmod q \qquad (15)$$

where $\bar a$ is the reciprocal of $a$ modulo $q$, $a\bar a \equiv 1 \pmod q$.

Our task is to find $x \in [0, (ph)^{1/2}(\log h)^D]$ such that (15) holds for some integer $z$. Notice that if $q$ is small and $D$ is large then $(ph)^{1/2}(\log h)^D \ge (\log 3)^D$, therefore the interval $[0, (ph)^{1/2}(\log h)^D]$ contains every residue class modulo $q$; as a result, (15) holds trivially. From now on we can assume that $q$ is large,

$$q \ge \exp\big(16(6(\alpha + 1)/e)^{\alpha+1}\big) \qquad (16)$$

where $c, \alpha$ are constants arising from Lemma 4.2 with $\varepsilon = 1/3$.
Let $s = (pk, q)$; so we can write $pk = sp'$, $q = sq'$ with $(p', q') = 1$.
Let $D$ be a large constant (to be determined later) and set

$$L := (sq)^{1/2}(\log q)^D/2 \quad \text{and} \quad I := [L, 2L].$$

Note that

$$ph = pkq = sp'q \ge sq.$$

Thus we have

$$I \subset [0, (ph)^{1/2}(\log h)^D].$$

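Let $f$ be a smooth function defined with respect to the interval $I$ (as in Lemma 4.1). For fixed $z \in [1, q]$ the number of $x$ in $[0, (sq)^{1/2}\log^D q]$ satisfying (15) is at least

$$N_z := \sum_{m \in \mathbb{Z}} f(\bar{a}pkz^2 + 2\bar{a}pz_1z - \bar{a}t + mq).$$

By Poisson summation formula (8)

$$N_z = \sum_{m \in \mathbb{Z}} \frac{1}{q}\hat{f}\Big(\frac{m}{q}\Big) e\Big(\frac{(\bar{a}pkz^2 + 2\bar{a}pz_1z - \bar{a}t)m}{q}\Big).$$

By summing over $z \in [1, q]$ we obtain

$$N := \sum_{z=1}^{q} N_z = \frac{1}{q}\sum_{m \in \mathbb{Z}} \hat{f}\Big(\frac{m}{q}\Big) \sum_{z=1}^{q} e\Big(\frac{(\bar{a}pkz^2 + 2\bar{a}pz_1z - \bar{a}t)m}{q}\Big).$$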
To conclude the proof, it suffices to show that $N > 0$. We are going to show (as is fairly standard in this area) that the sum is dominated by the contribution of the zero term.

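By the triangle inequality, we have

$$|N - \hat{f}(0)| \le \frac{1}{q}\sum_{m \ne 0} \Big|\hat{f}\Big(\frac{m}{q}\Big)\Big| \Big|\sum_{z=1}^{q} e\Big(\frac{(\bar{a}pkz^2 + 2\bar{a}pz_1z)m}{q}\Big)\Big|.$$

Let $\gamma_1, \gamma_2$ be sufficiently large constants and let

$$L' := \frac{\gamma_1 q(\log q)^{\gamma_2}}{L}.$$

Set

$$S_1 := \frac{1}{q}\sum_{|m| \ge L'} \Big|\hat{f}\Big(\frac{m}{q}\Big)\Big| \Big|\sum_{z=1}^{q} e\Big(\frac{(\bar{a}pkz^2 + 2\bar{a}pz_1z)m}{q}\Big)\Big|$$

and

$$S_2 := \frac{1}{q}\sum_{\substack{|m| < L' \\ m \ne 0}} \Big|\hat{f}\Big(\frac{m}{q}\Big)\Big| \Big|\sum_{z=1}^{q} e\Big(\frac{(\bar{a}pkz^2 + 2\bar{a}pz_1z)m}{q}\Big)\Big|.$$

We then have

$$|N - \hat{f}(0)| \le S_1 + S_2.$$

In what follows, we show that both $S_1$ and $S_2$ are less than $\hat{f}(0)/4$.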
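Estimate for $S_1$. It is not hard to show that

$$\sum_{k \in \mathbb{Z}} \exp(-\sqrt{x|k|}) \le \frac{5}{x} \quad\text{for } 0 < x \le 1.$$

To see this, observe that

$$\sum_{k \ge 1} \exp(-\sqrt{xk}) \le \int_0^\infty \exp(-\sqrt{xt})\,dt = \frac{2}{x},$$

where the integral is evaluated by changing variable and integration by parts.
Thus

$$\sum_{|k| \ge k_0} \exp(-\sqrt{x|k|}) \le \sum_{k \in \mathbb{Z}} \exp\Big(-\sqrt{x}\,\frac{\sqrt{k_0} + \sqrt{|k|}}{2}\Big) \le \frac{20}{x}\exp\Big(-\frac{\sqrt{xk_0}}{2}\Big).$$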
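The integral comparison above is easy to test numerically. The sketch below is an illustration only: the helper names and the truncation length are assumptions, chosen so that the omitted tail is negligible for the sampled values of $x$. It compares the one-sided sum with $2/x$, and the two-sided sum $1 + 2\sum_{k \ge 1}$ with $1 + 4/x \le 5/x$ for $0 < x \le 1$.

```python
import math

def one_sided_sum(x, terms=200_000):
    """Truncated sum of exp(-sqrt(x*k)) over k = 1..terms (a lower bound for the full sum)."""
    return sum(math.exp(-math.sqrt(x * k)) for k in range(1, terms + 1))

def two_sided_sum(x, terms=200_000):
    """Truncated sum over all integers k: the k = 0 term plus twice the one-sided sum."""
    return 1.0 + 2.0 * one_sided_sum(x, terms)

for x in (1.0, 0.5, 0.01):
    # The integrand is decreasing, so the sum over k >= 1 is at most the integral 2/x.
    print(x, one_sided_sum(x), 2 / x, two_sided_sum(x), 5 / x)
```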

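From the property of $f$ (Lemma 4.1) we can deduce that

$$S_1 \le 16\hat{f}(0) \sum_{|m| \ge \gamma_1 q(\log q)^{\gamma_2}/L} \exp\big(-\delta\sqrt{|m|L/q}\big),$$

which, via (17) and since $q \ge 3$, implies

$$S_1 \le 16\hat{f}(0)\,\frac{20q}{\delta^2 L}\exp\Big(-\frac{\delta(\gamma_1(\log q)^{\gamma_2})^{1/2}}{2}\Big) \le \hat{f}(0)/4,$$

given that we choose $\gamma_1, \gamma_2$ sufficiently large.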
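Estimate for $S_2$. We have

$$S_2 \le \frac{\hat{f}(0)}{q}\sum_{\substack{|m| < L' \\ m \ne 0}} \Big|\sum_{z=1}^{q} e\Big(\frac{\bar{a}p'mz^2}{q'} + \frac{2\bar{a}pz_1mz}{q}\Big)\Big|.$$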



We shall choose $D > \gamma_2$.
Set

$$\gamma_1 := \Big(\frac{6(D - \gamma_2)}{e}\Big)^{D - \gamma_2}.$$



First, we observe that

$$L'q = \frac{2\gamma_1 q^2(\log q)^{\gamma_2}}{(sq)^{1/2}(\log q)^D} = \frac{2\gamma_1 q^{3/2}}{s^{1/2}(\log q)^{D-\gamma_2}} = \frac{2\gamma_1 q'^{1/2}q}{(\log q)^{D-\gamma_2}} \ge q'^{4/3}\,\frac{\gamma_1 q^{1/6}}{(\log q)^{D-\gamma_2}}.$$

It is not hard to show that the function $q^{1/6}/(\log q)^{D-\gamma_2}$, where $q \ge 3$, attains its minimum at $q = \exp(6(D - \gamma_2))$. Therefore, by the choice of $\gamma_1$, we have

$$L'q \ge q'^{4/3}.$$
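The location and value of this minimum can be checked with a short computation. This is a minimal sketch under stated assumptions: the function `phi` and the exponent name `beta` (standing for $D - \gamma_2$) are illustrative, and the function is parametrised by $s = \log q$ to avoid overflow. Setting the derivative of $s/6 - \beta\log s$ to zero gives $s = 6\beta$, where the value is $(e/(6\beta))^{\beta}$, the reciprocal of $\gamma_1$.

```python
import math

def phi(log_q, beta):
    """Value of q^(1/6) / (log q)^beta, written in terms of s = log q."""
    return math.exp(log_q / 6) / log_q ** beta

beta = 2.0                      # illustrative stand-in for D - gamma_2
s_star = 6 * beta               # claimed minimiser: log q = 6 * (D - gamma_2)
min_value = phi(s_star, beta)   # equals (e / (6 * beta)) ** beta

# Scan a grid of s = log q starting at log 3 and confirm nothing falls below min_value.
grid = [math.log(3) + 0.05 * i for i in range(4000)]
print(min_value, min(phi(s, beta) for s in grid))
```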

Next, Lemma 4.2 applied for $\varepsilon = 1/3$ (and with the mentioned $c$ and $\alpha$) yields

$$S_2 \le \frac{\hat{f}(0)}{q}\sum_{|m| < L'} \Big|\sum_{z=1}^{q} e\Big(\frac{\bar{a}p'mz^2}{q'} + \frac{2\bar{a}pz_1mz}{q}\Big)\Big|$$

<c^(| + ^)(log 9 )« 

g V7 =2c ^r( lo g?) • 



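It follows that

$$S_2 \le \frac{4c\gamma_1 q(\log q)^{\alpha+\gamma_2}}{(sqq')^{1/2}(\log q)^D}\hat{f}(0) = \frac{4c\gamma_1(\log q)^{\alpha+\gamma_2}}{(\log q)^D}\hat{f}(0).$$

Now we choose $D, \gamma_2$ so that $D - \gamma_2 - \alpha = 1$. Thus $\gamma_1 = (6(\alpha+1)/e)^{\alpha+1}$, and

$$S_2 \le \frac{4c\gamma_1}{\log q}\hat{f}(0) \le \hat{f}(0)/4,$$

where the last inequality comes from (16).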



Remark 7.3. We can also use Burgess estimate to have an alternative proof with a 
slightly better bound. However, an improvement in this section does not improve 
the main theorem. 



8. Proof of Lemma 2.4 

We first apply Lemma 3.6 to obtain a large proper GAP $Q$ of rank 1 or 2. By this lemma, we have $A'' \subset \{s_1, \dots, s_m\} + Q$, where $m$ is a constant.

Let $S_i = A'' \cap (s_i + Q)$ for $1 \le i \le m$. We would like to guarantee that all $S_i$ are large by the following argument.

If $S_i$ is smaller than $n^{1/3}(\log n)^{3C/10}$, then we delete it from $A''$ and add it to $A'$. The new sets $A'$, $A''$ and $Q$ still satisfy the claim of Lemma 3.6. On the other hand, the total number of elements added to $A'$ is only $O(n^{1/3}(\log n)^{3C/10}) = o(|A'|)$, thus the sizes of $A'$ and $A''$ hardly change.

From now on, we assume that $|S_i| \ge n^{1/3}(\log n)^{3C/10}$ for all $i$.

For convenience, we let

$$s_i' := s_i + r.$$

Thus every element of $S_i$ is congruent to $s_i'$ modulo $q$.



8.1. Q has rank one. In this subsection, we deal with the (easy) case when $Q$ has rank one. We write $Q = \{r + qx \mid 0 \le x \le L\}$ where $L \ge n^{2/3}(\log n)^{C/2}$.

Since $Q \subset S_{A'} \subset [|A'|n/p]$, we have

$$q \le \frac{|A'|n}{pL} \le \frac{n^{2/3}}{(\log n)^{C/6}p}.$$

By setting $C$ (of Lemma 3.6) sufficiently large compared to $D$ (of Lemma 4.3), we can guarantee that

$$(pq)^{1/2}(\log q)^D \le n^{1/3}. \tag{18}$$

Let $d := (s_1 + r, \dots, s_m + r, q) = (s_1', \dots, s_m', q)$. If $d > 1$ then all elements of $A''$ are divisible by $d$, since $A''$ is covered by $\{s_1, \dots, s_m\} + Q$. Thus we reach the third case of the lemma and are done.

Assume now that $d = 1$. By Lemma 4.3, we can find $0 \le x_i \le (pq)^{1/2}(\log q)^D$ such that

$$s_1'x_1 + \dots + s_m'x_m + r \equiv pz^2 \pmod{q}. \tag{19}$$

Pick from $S_i$ exactly $x_i$ elements and add them together to obtain a number $s$. The set $s + Q$ is a translate of $Q$ which satisfies the first case of Lemma 2.4 and we are done.
8.2. Q has rank two. In this section, we consider the (harder) case when $Q$ has rank two. The main idea is similar to the rank one case, but the technical details are somewhat more tedious. We write

$$Q = \{r + q(q_1x + q_2y) \mid 0 \le x \le L_1,\ 0 \le y \le L_2\}$$

where $L_1L_2 = |Q| \ge n\log^{2C/3} n$.

As $Q$ is proper, either $q_1 \ge L_2$ or $q_2 \ge L_1$ holds. Thus $qL_1L_2 \le |A'|n/p$, which yields (with room to spare)

$$q \le \frac{n^{1/3}}{(\log n)^{C/6}p}. \tag{20}$$

We consider two cases. In the first (simple) case, both $L_1$ and $L_2$ are large. In the second, one of them can be small.

Case 1. $\min(L_1, L_2) \ge n^{1/3}(\log n)^{C/4}$. Define $d := (s_1', \dots, s_m', q)$ and argue as in the previous section. If $d > 1$, then we end up with the third case of Lemma 2.4. If $d = 1$ then apply Lemma 4.3. The fact that $q$ is sufficiently small (see (20)) and that $|S_i|$ is sufficiently large guarantees that we can choose $x_i$ elements from $S_i$. At the end, we will obtain a GAP of rank 2 which is a translate of $Q$ and satisfies the second case of Lemma 2.4.

Case 2. $\min(L_1, L_2) < n^{1/3}(\log n)^{C/4}$. In this case the sides of the GAP $Q$ are unbalanced and one of them is much larger than the other. We are going to exploit this fact to create a GAP of rank one (i.e., an arithmetic progression) which satisfies the first case of Lemma 3.6, rather than trying to create a GAP of rank two as in the previous case.

Without loss of generality, we assume that $L_1 < n^{1/3}(\log n)^{C/4}$. By the lower bound on $L_1L_2$, we have that $L_2 \ge n^{2/3}(\log n)^{C/4}$. This implies

$$qq_2 \le \frac{|A'|n}{pL_2} \le \frac{n^{2/3}}{(\log n)^{C/12}p}.$$

Again by setting $C$ sufficiently large compared to $D$, we have

$$(pqq_2)^{1/2}(\log qq_2)^D \le n^{1/3}(\log n)^{C/5}. \tag{21}$$

Creating a long arithmetic progression. In the rest of the proof we make use of $A''$ and $Q$ to create an AP of type $\{r' + qq_2x_2 \mid 0 \le x_2 \le L_2,\ r' \equiv pz^2 \pmod{qq_2}\}$. This gives the first case in Lemma 3.6 and thus completes the proof of this lemma.

Let $S$ be an element of $\{S_1, \dots, S_m\}$. Since $S$ is contained in a translate of $Q$, there is a number $s$ such that any $a \in S$ satisfies $a \equiv s + tqq_1 \pmod{qq_2}$ for some $0 \le t \le L_1$ (for instance, if $a \in S_i$ then $a \equiv s_i' + tqq_1 \pmod{qq_2}$). Let $T$ denote the multiset of $t$'s obtained this way. Notice that $T$ could contain one element of multiplicity $|S|$. Also recall that $|S| \ge n^{1/3}(\log n)^{3C/10}$.

For $1 \le l \le |S|/2$, let $m_l$ and $M_l$ (respectively) be the minimal and maximal values of the sum of $l$ elements of $T$. Since $0 \le t \le L_1$ for every $t \in T$, by swapping summands of $m_l$ with those of $M_l$, we can obtain a sequence $m_l = n_0 \le \dots \le n_l = M_l$ where each $n_i \in l^*T$ and $n_{i+1} - n_i \le L_1$ for all relevant $i$.

By construction, we have

$$[m_l, M_l] \subset \{n_0, \dots, n_l\} + [0, L_1] \subset l^*T + [0, L_1]. \tag{22}$$
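The swapping argument behind (22) can be made concrete. The following is a minimal sketch, with the function name `swap_chain` and the sample multiset chosen purely for illustration: starting from the $l$ smallest elements of $T$, it replaces them one at a time by the $l$ largest, so each partial sum uses $l$ elements of $T$ and consecutive sums differ by a single difference $t_i' - t_i \in [0, L_1]$.

```python
def swap_chain(T, l):
    """Return [n_0, ..., n_l]: sums of l elements of T going from the minimum to the maximum."""
    ts = sorted(T)
    if 2 * l > len(ts):
        raise ValueError("need 2*l <= |T| so the smallest and largest blocks are disjoint")
    small, large = ts[:l], ts[-l:]
    chain = [sum(small)]                     # n_0 = m_l
    for i in range(l):
        # swap the i-th smallest summand for the i-th of the l largest
        chain.append(chain[-1] - small[i] + large[i])
    return chain                             # chain[-1] = M_l

T = [0, 3, 3, 5, 1, 4, 2, 5]                 # illustrative multiset with entries in [0, L_1], L_1 = 5
chain = swap_chain(T, 3)
print(chain)                                 # [3, 7, 11, 14]
print(max(b - a for a, b in zip(chain, chain[1:])))  # largest step, at most L_1
```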

Next we observe that if $l$ is large and $M_l - m_l$ is small, then $T$ looks like a sequence of only one element with high multiplicity. We will call this element the essential element of $T$.

Proposition 8.3. Assume that i(n 1 / 3 (logn) 3C / 10 < I < in 1 / 3 (logn) 3C / 10 and 
Mi - mi < in 1/3 (logn) 3C/10 . Then all but at most in 1/3 (logn) 3C/10 elements of 
T are the same. 

Proof (of Proposition 8.3) Let $t_1 \le t_2 \le \dots \le t_l$ be the $l$ smallest elements of $T$ and $t_1' \le \dots \le t_l'$ be the $l$ largest. By the upper bound on $l$ and lower bound on $|S| = |T|$, $t_1' \ge t_l$. On the other hand, $M_l - m_l = (t_1' - t_1) + \dots + (t_l' - t_l)$. Thus

if Mi~mi< in 1 / 3 (logn) 3C / 10 < I - 1 then t\ = U for some i. The claim follows. ■ 

The above arguments work for any $S$ among $S_1, \dots, S_m$. We now associate to each $S_i$ a multiset $T_i$, for all $1 \le i \le m$.

Subcase 2.1 The hypothesis in Proposition 8.3 holds for all $T_i$. In this case we move to $A'$ those elements of $S_i$ whose corresponding parts in $T_i$ are not the essential element. The number of elements moved is only $O(n^{1/3}(\log n)^{3C/10})$, which is negligible compared to both $|A'|$ and $|A''|$. Furthermore, the properties claimed in Lemma 3.6 remain unchanged and the sizes of the new $S_i$ are now at least
in^logn) 3 ^ 10 . 

Now consider the elements of $A''$ modulo $qq_2$. Since each $T_i$ has only the essential element, the elements of $A''$ produce at most $m$ residues $u_i = s_i' + t_iqq_1$, each of multiplicity at least

\Si\ > V/ 3 (logn) 3C / 10 > (pqq 2 ) 1 / 2 (log qq 2 ) D 

where the last inequality comes from (21). Define $d := (u_1, \dots, u_m, qq_2)$ and proceed as usual, applying Lemma 4.3.
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Subcase 2.2 The hypothesis in Proposition 8.3 does not hold for all Tj. We can as- 
sume that, with respect to Ti, Mj-mj > in 1 / 3 (logn) 3C ' /:L0 for all in 1 / 3 (logn) 3C / 10 < 
I < in 1 / 3 (logn) 3C '/ 10 . From now on, fix an I in this interval. 

Next, for a technical reason, we extract from $S_1$ a very small part $S_1'$ of cardinality $n^{1/3}(\log n)^{C/5}$ and set $S_1'' = S_1 \setminus S_1'$. Let $T$ be the multiset associated with $S_1''$. We can assume that $T$ satisfies the hypothesis of this subcase.

Define $d := (s_1', \dots, s_m', q)$. As usual, the case $d > 1$ leads to the third case of Lemma 2.4, so we can assume $d = 1$. By Lemma 4.3, there exist integers

< x t < {pq^^ilogn) < nWQognf' 6 < |^| 
and $k, z_1$ such that

$$s_1'x_1 + \dots + s_m'x_m + (ls_1' + r) = pz_1^2 + kq. \tag{23}$$

For $i \ge 2$ we pick from $S_i$ exactly $x_i$ elements $a_1^i, \dots, a_{x_i}^i$, and for $i = 1$ we pick $x_1$ elements $a_1^1, \dots, a_{x_1}^1$ from $S_1'$ and add them together. By (23) the following holds for some integer $k'$,

$$\sum_{i=1}^{m}\sum_{j=1}^{x_i} a_j^i + ls_1' + r = pz_1^2 + k'q. \tag{24}$$

Furthermore, by Proposition 7.2, as $q = (qq_1, qq_2)$, there exist $0 \le x \le (pqq_2)^{1/2}\log^D(qq_2)$ and $k'', z_2$ such that

$$qq_1x + pz_1^2 + (k' + m_lq_1)q = pz_2^2 + k''qq_2,$$

that is,

$$pz_1^2 + k'q + (x + m_l)qq_1 = pz_2^2 + k''qq_2. \tag{25}$$
As $(pqq_2)^{1/2}\log^D(qq_2) \le n^{1/3}\log^{C/5} n$ and $n^{1/3}\log^{C/5} n \le M_l - m_l$, we have

$$m_l \le x + m_l \le M_l.$$

On the other hand, recalling that $[m_l, M_l] \subset l^*T + [0, L_1]$ (see (22)), we have

$$\{ls_1' + r + [m_l, M_l]qq_1\} \subset l^*S_1'' + r + [0, L_1]qq_1 \pmod{qq_2}.$$

Thus

$$ls_1' + r + (x + m_l)qq_1 \in l^*S_1'' + r + [0, L_1]qq_1 \pmod{qq_2}. \tag{26}$$

Combining (24), (25) and (26) we infer that there exist $l$ elements $a_1, \dots, a_l$ of $S_1''$, and there exist $0 \le u \le L_1$ and $v$ such that

$$\sum_{i=1}^{m}\sum_{j=1}^{x_i} a_j^i + a_1 + \dots + a_l + r + uqq_1 = pz_2^2 + vqq_2.$$

Hence, $\sum_{i=1}^{m}\sum_{j=1}^{x_i} a_j^i + a_1 + \dots + a_l + Q$ contains the AP $\{(pz_2^2 + vqq_2) + qq_2x_2 \mid 0 \le x_2 \le L_2\}$, completing Subcase 2.2.

Finally, one checks easily that the number of elements of $A''$ involved in the creation of $pz_2^2$ in all cases is bounded by $O(n^{1/3}\log^{C/5} n) = o(|A'|)$, thus we may put all of them into $A'$ without loss of generality.



9. Proof of Theorem 1.5: The rank one case.

Here we consider the (easy) case when $Q$ (in Lemma 2.4) has rank one. In this case, $S_{A'}$ contains an AP $Q = \{r + qx \mid 0 \le x \le L\}$, where $L \ge n^{2/3}(\log n)^{C/4}$ as in the first statement of Lemma 2.4. We are going to show that $Q$ contains a number of the form $pz^2$.

Write $r = pz_0^2 + tq$ for some $0 \le z_0 < q$. Since $r$ is a sum of some elements of $A'$, we have

$$0 \le r \le |A'|(n/p) \le \frac{n^{4/3}(\log n)^{C/3}}{p}.$$

Thus

$$-pq \le t \le \frac{n^{4/3}(\log n)^{C/3}}{pq}. \tag{27}$$



The interval [t/pq, (t + L)/pq] contains at least two squares because 




> 10— + 20. 

pq 
Thus, we can find an integer $x_0 \ge 0$ such that $\frac{t}{pq} \le x_0^2 < (x_0 + 1)^2 \le \frac{t + L}{pq}$. It is implied that (since $0 \le z_0 < q$)

$$t \le pqx_0^2 + 2pz_0x_0 \le t + L. \tag{28}$$

Set $z := z_0 + qx_0$. We have

$$pz^2 = pz_0^2 + q(pqx_0^2 + 2pz_0x_0).$$

On the other hand, by (28), the right hand side belongs to

$$pz_0^2 + q[t, t + L] = pz_0^2 + tq + q[0, L] = r + q[0, L] = Q.$$

Thus, $Q$ contains $pz^2$, completing the proof for this case.
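The rank one recipe is constructive and can be followed step by step on small numbers. The sketch below is illustrative only: the function name `square_multiple_in_ap` and the sample values of $p, q, r, L$ are assumptions, and the search is brute force rather than the counting argument of the text. It finds $z_0$ with $r \equiv pz_0^2 \pmod q$, then the smallest $x_0 \ge 0$ satisfying (28), and returns $z = z_0 + qx_0$.

```python
def square_multiple_in_ap(p, q, r, L):
    """Search for z with p*z^2 in Q = {r + q*x : 0 <= x <= L}; return None if the search fails."""
    for z0 in range(q):
        if (r - p * z0 * z0) % q != 0:
            continue                          # need r = p*z0^2 + t*q
        t = (r - p * z0 * z0) // q
        x0 = 0
        # the quantity in (28) is increasing in x0, so scan until it exceeds t + L
        while p * q * x0 * x0 + 2 * p * z0 * x0 <= t + L:
            if p * q * x0 * x0 + 2 * p * z0 * x0 >= t:
                return z0 + q * x0            # z = z0 + q*x0
            x0 += 1
    return None

p, q, r, L = 3, 7, 999, 500                   # illustrative parameters
z = square_multiple_in_ap(p, q, r, L)
print(z, p * z * z, (p * z * z - r) // q)     # position of p*z^2 inside the progression
```

For very short progressions the search can return `None`; the length condition on $L$ in the text is what rules this out.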

10. Proof of Theorem 1.5: The rank two case 



In this case, we assume that $S_{A'}$ contains a proper GAP as in the second statement of Lemma 2.4. We can write

$$Q = \{r + q(q_1x_1 + q_2x_2) \mid 0 \le x_1 \le L_1,\ 0 \le x_2 \le L_2,\ (q_1, q_2) = 1\}$$

where

• $\min(L_1, L_2) \ge n^{1/3}(\log n)^{C/4}$,

• $L_1L_2 \ge n(\log n)^{C/2}$,

y — p > 

• and $r = pz_0^2 + tq$ for some integers $t$ and $0 \le z_0 < q$.

Since $r$ is a sum of some elements of $A'$, we have $0 \le r \le \frac{n^{4/3}(\log n)^{C/3}}{p}$, and so

$$-pq \le t \le \frac{n^{4/3}(\log n)^{C/3}}{pq}.$$



Without loss of generality, we assume that $q_2L_2 \ge q_1L_1$. Because $Q$ is proper, either $q_2 \ge L_1$ or $q_1 \ge L_2$. On the other hand, if $q_2 < L_1$ then $L_2 \le q_1$, which is impossible by the assumption. Hence,

$$q_2 \ge L_1.$$

Now we write $Q = \{pz_0^2 + q(q_1x_1 + q_2x_2 + t) \mid 0 \le x_1 \le L_1,\ 0 \le x_2 \le L_2,\ (q_1, q_2) = 1\}$ and notice that if we set $w := z_0 + zq$ then

$$pw^2 - pz_0^2 = p(z_0 + qz)^2 - pz_0^2 = q(pqz^2 + 2pz_0z).$$

Thus if there is an integer $z$ satisfying

$$pqz^2 + 2pz_0z \in \{q_1x + q_2y + t \mid 0 \le x \le L_1,\ 0 \le y \le L_2\} \tag{29}$$

then $pw^2 \in Q$, and we are done with this case. The rest of the proof is the verification of the following proposition, which shows the existence of a desired $z$.
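To see how a solution of (29) produces an element $pw^2$ of $Q$, one can run a brute-force search on small parameters. This is a sketch under stated assumptions: the function name `find_z`, the search bound `z_max`, and the sample values are illustrative, and exhaustive search stands in for the Poisson summation argument used in the actual proof.

```python
def find_z(p, q, z0, t, q1, q2, L1, L2, z_max):
    """Search for (z, x, y) with p*q*z^2 + 2*p*z0*z = q1*x + q2*y + t, 0 <= x <= L1, 0 <= y <= L2."""
    for z in range(z_max + 1):
        target = p * q * z * z + 2 * p * z0 * z - t
        for x in range(L1 + 1):
            rem = target - q1 * x             # must equal q2 * y
            if rem >= 0 and rem % q2 == 0 and rem // q2 <= L2:
                return z, x, rem // q2
    return None

p, q, z0, t = 2, 3, 1, 4                      # illustrative values; r = p*z0^2 + t*q
q1, q2, L1, L2 = 5, 7, 6, 6
r = p * z0 * z0 + t * q
z, x, y = find_z(p, q, z0, t, q1, q2, L1, L2, z_max=10)
w = z0 + z * q
print(p * w * w, r + q * (q1 * x + q2 * y))   # both numbers agree, so p*w^2 lies in Q
```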

Proposition 10.1. There exists an integer z which satisfies (29). 



Proof (of Proposition 10.1) The method is similar to that of Lemma 4.3, relying on Poisson summation.

Set $a := pq$ and $b := 2pz_0$. Notice that since $0 \le z_0 < q$, $0 \le b < 2pq = 2a$. Our task is to find a $z$ such that

$$az^2 + bz - q_1x - t = q_2y \quad\text{for some } 0 \le x \le L_1,\ 0 \le y \le L_2.$$

Define (with foresight; see (31)) $I_x := [L_1/8, L_1/4]$ and

$$I_z := \Big[\Big(\frac{q_1L_1/4 + t}{a}\Big)^{1/2},\ \Big(\frac{q_2L_2 + q_1L_1/8 + t}{a}\Big)^{1/2} - 1\Big].$$

(Notice that the lower bounds on $L_1, L_2$ and the upper bound on $pq$ guarantee that the expressions under the square roots are positive.)

Since $r + qq_1L_1 + qq_2L_2 = pz_0^2 + tq + q(q_1L_1 + q_2L_2) \in Q$, it follows that (with $\max(Q)$ denoting the value of the largest element of $Q$)

$$q_2L_2 + q_1L_1/8 + t \le \max(Q)/q \le \frac{p^{-1}n^{4/3}(\log n)^{C/3}}{q} = \frac{n^{4/3}(\log n)^{C/3}}{pq}.$$



Thus 






1 (q 2 L 2 - q\L x /A)a 



-l 




q 2 L 2 +qiL 1 /8+t 




)■ 



(30) 



n 2/3( logn )C/6 



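By the definitions of $I_x$ and $I_z$, we have, for any $x \in I_x$ and $z \in I_z$,

$$0 \le az^2 + bz - q_1x - t < a(z + 1)^2 - q_1x - t \le q_2L_2. \tag{31}$$

Thus, for any such pair of $x$ and $z$, if $az^2 + bz - q_1x - t$ is divisible by $q_2$, then $y := (az^2 + bz - q_1x - t)/q_2$ is an integer in $[0, L_2]$. We are now using the ideas from Section 7, with respect to the modulus $q_2$ and the intervals $I_x, I_z$.

Let $\bar{q}_1$ be the reciprocal of $q_1$ modulo $q_2$ (recall that $(q_1, q_2) = 1$). Let $f$ be a function given by Lemma 4.1 with respect to the interval $I_x$. For a given $z \in I_z$, the number of $x \in I_x$ satisfying (29) is at least $N_z$, where

$$N_z := \sum_{m \in \mathbb{Z}} f\big(\bar{q}_1(az^2 + bz - t) + mq_2\big).$$

By applying Poisson summation formula (8) and summing over $z$ in $I_z$ we obtain

$$N := \sum_{z \in I_z} N_z = \frac{1}{q_2}\sum_{m \in \mathbb{Z}} \hat{f}\Big(\frac{m}{q_2}\Big)\sum_{z \in I_z} e\Big(\frac{(\bar{q}_1az^2 + \bar{q}_1bz - \bar{q}_1t)m}{q_2}\Big).$$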

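It suffices to show that $N > 0$. Similar to the proof of Lemma 4.3, we will again show that the right hand side is dominated by the contribution at $m = 0$. By the triangle inequality, we have

$$\Big|N - \frac{\hat{f}(0)|I_z|}{q_2}\Big| \le \frac{1}{q_2}\sum_{m \ne 0}\Big|\hat{f}\Big(\frac{m}{q_2}\Big)\Big|\Big|\sum_{z \in I_z} e\Big(\frac{(\bar{q}_1az^2 + \bar{q}_1bz - \bar{q}_1t)m}{q_2}\Big)\Big|.$$

Let $\gamma$ be a sufficiently large constant and let

$$L' := \frac{8q_2(\log q_2)^{\gamma}}{L_1}.$$

We have

$$\Big|N - \frac{\hat{f}(0)|I_z|}{q_2}\Big| \le S_1 + S_2$$

where

$$S_1 := \frac{1}{q_2}\sum_{|m| \ge L'}\Big|\hat{f}\Big(\frac{m}{q_2}\Big)\Big|\Big|\sum_{z \in I_z} e\Big(\frac{(\bar{q}_1az^2 + \bar{q}_1bz - \bar{q}_1t)m}{q_2}\Big)\Big|$$

and

$$S_2 := \frac{1}{q_2}\sum_{\substack{|m| < L' \\ m \ne 0}}\Big|\hat{f}\Big(\frac{m}{q_2}\Big)\Big|\Big|\sum_{z \in I_z} e\Big(\frac{(\bar{q}_1az^2 + \bar{q}_1bz - \bar{q}_1t)m}{q_2}\Big)\Big|.$$

To conclude the proof, we will show that both $S_1$ and $S_2$ are $o\big(\hat{f}(0)|I_z|/q_2\big)$.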

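Estimate for $S_1$. By the property of $f$,

$$S_1 \le \frac{16\hat{f}(0)|I_z|}{q_2}\sum_{|m| \ge L'} \exp\big(-\delta\sqrt{|mL_1/(8q_2)|}\big).$$

By (17), and as $q_2$ is large ($q_2 \ge L_1 \ge n^{1/3}$), the inner sum is $o(1)$, so

$$S_1 = o\Big(\frac{\hat{f}(0)|I_z|}{q_2}\Big)$$

as desired.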

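Estimate for $S_2$. Let $q' = (\bar{q}_1a, q_2)$. We can write

$$\bar{q}_1a = q'q_1', \quad q_2 = q'q_2' \quad\text{with } (q_1', q_2') = 1.$$

Then

$$S_2 \le \frac{\hat{f}(0)}{q_2}\sum_{\substack{|m| < L' \\ m \ne 0}}\Big|\sum_{z \in I_z} e\Big(\frac{q_1'mz^2}{q_2'} + \frac{(\bar{q}_1bz - \bar{q}_1t)m}{q_2}\Big)\Big|.$$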



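By Lemma 4.2 there are absolute constants $c, \alpha$ such that

$$S_2 \le c\,\frac{\hat{f}(0)}{q_2}\Big(L'\sqrt{|I_z|}(\log n)^{\alpha} + \frac{L'|I_z|(\log n)^{\alpha}}{\sqrt{q_2'}}\Big).$$

To show that $S_2 = o\big(\hat{f}(0)|I_z|/q_2\big)$, it suffices to show that

$$L'(\log n)^{\alpha} = o\big(\sqrt{|I_z|}\big) \tag{34}$$

and

$$L'(\log n)^{\alpha} = o\big(\sqrt{q_2'}\big). \tag{35}$$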
To verify (34), notice that by (30), we have 



Thus 



T \1 2 / T 2 T 



2 r 2 

2 



L' 2 (logn) 2 » ^f(logn) 2 «+ 2 V VL 292 n 2 / 3 (logn) c V 6 + 2 «+ 2 W' 

Since $(L_1L_2)^2 \ge (n(\log n)^{C/2})^2 = n^2\log^C n$ and $L_2q_2 = O(\max(Q)) = O(p^{-1}n^{4/3}(\log n)^{C/3})$, the last formula is $\omega(1)$ if we set $C$ sufficiently large compared to $\alpha$ and $\gamma$. This proves (34).

As a result,

$$\frac{\hat{f}(0)}{q_2}L'\sqrt{|I_z|}(\log n)^{\alpha} = o\big(\hat{f}(0)|I_z|/q_2\big).$$

Now we turn to (35). Recall that $q_2 = q'q_2'$ and $q' = (\bar{q}_1a, q_2) = (a, q_2)$ (as $\bar{q}_1$ and $q_2$ are coprime). Thus

$$q_2' \ge \frac{q_2}{a} = \frac{q_2}{pq}.$$



To show (35), it suffices to show that

$$\frac{q_2}{pq} = \omega\big(L'^2(\log n)^{2\alpha}\big)$$

which (taking into account the definition of $L'$) is equivalent to

$$q_2L_1^2 = \omega\big(pqq_2^2(\log n)^{2\alpha + 2\gamma}\big).$$

Multiplying both sides by $L_2q_2^{-1}$, it reduces to

$$L_1^2L_2 = \omega\big(pqq_2L_2(\log n)^{2\alpha + 2\gamma}\big).$$

Now we use the fact that $qq_2L_2 = O(\max(Q)) = O(p^{-1}n^{4/3}(\log n)^{C/3})$ and the lower bounds $L_1L_2 \ge n(\log n)^{C/2}$ and $L_1 \ge n^{1/3}(\log n)^{C/4}$. The claim follows by setting $C$ sufficiently large compared to $\alpha$ and $\gamma$, as usual. Our proof is completed.